ABSTRACT



In a processor speculatively executing instructions which 
specify logical addresses, a method and apparatus for specu- 
latively converting logical addresses to physical addresses. 
The processor has a register window movable within a
register file, a window pointer register maintaining a value
corresponding to the location of the window in the register
file, and a speculative window pointer register maintaining a
speculative value of the window pointer register. A controller
identifies an instruction expected to modify the value in
the window pointer register, and in response to identifying
the instruction, the controller modifies the speculative value.
A mapper, coupled to the speculative window pointer
register, converts the instruction-specified logical addresses
to physical addresses based on the speculative value
contained in the speculative window pointer register.

16 Claims, 7 Drawing Sheets 
APPARATUS FOR HANDLING REGISTER
WINDOWS IN AN OUT-OF-ORDER 
PROCESSOR 

CROSS-REFERENCES TO RELATED 
APPLICATIONS 

The subject matter of the present application is related to
that of co-pending U.S. patent application Ser. No. 08/881,958
identified as Docket No. P2345/37178.830071.000 for
AN APPARATUS FOR HANDLING ALIASED FLOATING-POINT
REGISTERS IN AN OUT-OF-ORDER PROCESSOR filed concurrently
herewith by Ramesh Panwar; Ser. No. 08/881,729 identified as
Docket No. P2346/37178.830072.000 for APPARATUS FOR PRECISE
ARCHITECTURAL UPDATE IN AN OUT-OF-ORDER PROCESSOR filed
concurrently herewith by Ramesh Panwar and Arjun Prabhu;
Ser. No. 08/881,726 identified as Docket No.
P2348/37178.830073.000 for AN APPARATUS FOR NON-INTRUSIVE
CACHE FILLS AND HANDLING OF LOAD MISSES filed concurrently
herewith by Ramesh Panwar and Ricky C. Hetherington;
Ser. No. 08/881,908 identified as Docket No.
P2349/37178.830074.000 for AN APPARATUS FOR HANDLING
COMPLEX INSTRUCTIONS IN AN OUT-OF-ORDER PROCESSOR filed
concurrently herewith by Ramesh Panwar and Dani Y. Dakhil;
Ser. No. 08/882,173 identified as Docket No.
P2350/37178.830075.000 for AN APPARATUS FOR ENFORCING TRUE
DEPENDENCIES IN AN OUT-OF-ORDER PROCESSOR filed concurrently
herewith by Ramesh Panwar and Dani Y. Dakhil;
Ser. No. 08/881,145 identified as Docket No.
P2351/37178.830076.000 for APPARATUS FOR DYNAMICALLY
RECONFIGURING A PROCESSOR filed concurrently herewith by
Ramesh Panwar and Ricky C. Hetherington; Ser. No. 08/881,732
identified as Docket No. P2353/37178.830077.000 for
APPARATUS FOR ENSURING FAIRNESS OF SHARED EXECUTION
RESOURCES AMONGST MULTIPLE PROCESSES EXECUTING ON A SINGLE
PROCESSOR filed concurrently herewith by Ramesh Panwar and
Joseph I. Chamdani; Ser. No. 08/882,175 identified as Docket
No. P2355/37178.830078.000 for SYSTEM FOR EFFICIENT
IMPLEMENTATION OF MULTI-PORTED LOGIC FIFO STRUCTURES IN A
PROCESSOR filed concurrently herewith by Ramesh Panwar;
Ser. No. 08/882,311 identified as Docket No.
P2365/37178.830080.000 for AN APPARATUS FOR MAINTAINING
PROGRAM CORRECTNESS WHILE ALLOWING LOADS TO BE BOOSTED PAST
STORES IN AN OUT-OF-ORDER MACHINE filed concurrently
herewith by Ramesh Panwar, P. K. Chidambaran and Ricky C.
Hetherington; Ser. No. 08/881,731 identified as Docket No.
P2369/37178.830081.000 for APPARATUS FOR TRACKING PIPELINE
RESOURCES IN A SUPERSCALAR PROCESSOR filed concurrently
herewith by Ramesh Panwar; Ser. No. 08/882,525 identified as
Docket No. P2370/37178.830082.000 for AN APPARATUS FOR
RESTRAINING OVER-EAGER LOAD BOOSTING IN AN OUT-OF-ORDER
MACHINE filed concurrently herewith by Ramesh Panwar and
Ricky C. Hetherington; Ser. No. 08/881,847 identified as
Docket No. P2372/37178.830084.000 for AN APPARATUS FOR
DELIVERING PRECISE TRAPS AND INTERRUPTS IN AN OUT-OF-ORDER
PROCESSOR filed concurrently herewith by Ramesh Panwar;
Ser. No. 08/881,728 identified as Docket No.
P2398/37178.830085.000 for NON-BLOCKING HIERARCHICAL CACHE
THROTTLE filed concurrently herewith by Ricky C. Hetherington
and Thomas M. Wicki; Ser. No. 08/881,727 identified as Docket
No. P2406/37178.830086.000 for NON-THRASHABLE NON-BLOCKING
HIERARCHICAL CACHE filed concurrently herewith by Ricky C.
Hetherington, Sharad Mehrotra and Ramesh Panwar;
Ser. No. 08/881,065 identified as Docket No.
P2408/37178.830087.000 for IN-LINE BANK CONFLICT DETECTION
AND RESOLUTION IN A MULTI-PORTED NON-BLOCKING CACHE filed
concurrently herewith by Ricky C. Hetherington, Sharad
Mehrotra and Ramesh Panwar; and Ser. No. 08/882,613
identified as Docket No. P2434/37178.830088.000 for SYSTEM
FOR THERMAL OVERLOAD DETECTION AND PREVENTION FOR AN
INTEGRATED CIRCUIT PROCESSOR filed concurrently herewith by
Ricky C. Hetherington and Ramesh Panwar, the disclosures of
which applications are herein incorporated by this reference.

BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates in general to microprocessors, and
more particularly, to microprocessor architectures and methods
for speculatively translating logical register addresses to
physical addresses in an out-of-order processor having
register windows.

2. Relevant Background 

Modern designs of computer processors (also called
microprocessors) provide registers for storing data or for
providing status or control information regarding the state of
the processor. With respect to data registers for storing
program data during execution within the processor, a
variety of register organization structures exist. One way to
organize registers within a processor is to use a register
windowing technique to access a plurality of registers in a
register file. With register windowing, a register window has
a predetermined number of contiguous registers, and the
window can be moved linearly within the register file. At
any one time, the register window permits program access to
a subset of the total number of registers in the register file.
Control registers are also associated with the register
windows so that a program can manipulate the position of the
window within the register file and monitor the status of the
window.

For example, in the specification for a scalable processor
architecture, SPARC-V9, the general purpose registers for
storing and manipulating data are arranged in register sets
accessible through register windows, each register window
having 32 registers. A particular processor can have multiple
register sets ranging from three register sets to 32 register
sets. Individual registers are addressable using a five-bit
address in conjunction with a current window pointer
(CWP). The register window is movable within the register
sets such that a program can logically address multiple
physical registers in the register sets by simply tracking a
logical register name or specifier (e.g., r[3] or r[28]) and the
current window pointer.
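
By way of illustration only, the relationship between a logical
register specifier and the current window pointer can be sketched as
follows. The sketch assumes a flat register file in which each CWP step
moves the window by a fixed number of registers. The names
physical_index, stride and file_size, and the values 16 and 128, are
hypothetical choices made for the example; they are not taken from the
SPARC-V9 specification, which also treats the global registers
separately.

```python
# Hypothetical, simplified model; NOT the exact SPARC-V9 register layout.
WINDOW_SIZE = 32  # registers visible through the window (five-bit specifier)


def physical_index(logical: int, cwp: int, stride: int = 16,
                   file_size: int = 128) -> int:
    """Map a five-bit logical specifier to a physical index for a given CWP.

    `stride` (how far the window moves per CWP step) and `file_size`
    are illustrative assumptions, not values mandated by the text.
    """
    if not 0 <= logical < WINDOW_SIZE:
        raise ValueError("logical specifier must fit in five bits")
    if cwp < 0:
        raise ValueError("CWP must be non-negative")
    # The window position acts as an offset; wrap around the register file.
    return (cwp * stride + logical) % file_size


if __name__ == "__main__":
    # The same logical names r[3] and r[28] land on different physical
    # registers as the window moves.
    for cwp in (0, 1, 2):
        print(cwp, physical_index(3, cwp), physical_index(28, cwp))
```

Under these assumptions, r[19] with CWP equal to 0 and r[3] with CWP
equal to 1 resolve to the same physical index, which is one reason a
logical name alone does not identify a register.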

The five-bit register addresses encoded in an instruction
word specify the instruction's source registers and the
destination register. These register specifiers are logical
addresses that index registers within the current register
window. Because the register window is movable within the
larger register file, the physical address of each register
specified by an instruction will depend on the location of the
current register window within the register file.
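
As a minimal sketch of how the five-bit specifiers can be read out of
an instruction word, the following example extracts the destination
and source register fields from a 32-bit word. The bit positions used
(rd in bits 29:25, rs1 in bits 18:14, the immediate flag in bit 13, and
rs2 in bits 4:0) follow the SPARC-V9 format 3 layout and are supplied
here as background; they are not stated in the text above. The function
name decode_register_fields is hypothetical.

```python
from typing import Dict, Optional


def decode_register_fields(word: int) -> Dict[str, Optional[int]]:
    """Extract the five-bit rd, rs1 and rs2 specifiers from a 32-bit word.

    Assumes the SPARC-V9 format 3 field positions. When the immediate
    flag (bit 13) is set there is no rs2 field, so rs2 is None.
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("instruction word must be a 32-bit value")
    uses_immediate = (word >> 13) & 0x1
    return {
        "rd": (word >> 25) & 0x1F,    # destination register specifier
        "rs1": (word >> 14) & 0x1F,   # first source register specifier
        "rs2": None if uses_immediate else word & 0x1F,
    }


if __name__ == "__main__":
    # 0x86004002 has rd=3, rs1=1, rs2=2 with the immediate flag clear.
    print(decode_register_fields(0x86004002))
```

The values returned are still logical addresses; they identify a
physical register only once the window position is taken into account.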

In a processor executing instructions speculatively or
out-of-order, it is useful to track the physical addresses of the
registers logically specified by an instruction. For instance,
instruction dependency checking requires that instructions
referencing the same physical register are detected so that
these instructions can be executed in the proper order to
eliminate the dependency.

Further, if instructions are speculatively processed within
the processor, handling an instruction which is down the
wrong path or mispredicted branch which may affect the
position of the register window is problematic.

What is needed is a processor and method for speculatively
translating logical register addresses to physical
addresses accounting for the expected position of the
register window within the register file.

SUMMARY OF THE INVENTION

The above problems have been solved by maintaining
speculative copies of the window management registers (for
example, the CWP, CANSAVE, CANRESTORE registers)
and using the speculative copies to map the logical registers
specified by an instruction into the physical registers from
the windowed register file. The speculative copies of the
window management registers are also used to determine the
occurrence of overflow and underflow traps associated with
the window. The speculative copy is always ahead of the
architectural copy except at certain synchronization points
when both the speculative and architectural copies are
synchronized. If a branch misprediction occurs which affects
the status or position of the window, a window repair table
is used to restore the state of the speculative window
management registers.

In an apparatus implementation of the invention, a processor
is disclosed which executes instructions specifying
logical addresses of a first source register, a second source
register, and a destination register. The processor has a
windowed register file of a plurality of registers, a portion of
which are accessible through a window movable within the
register file. The processor also has a window pointer
register maintaining a value corresponding to the location of
the window in the register file, and a speculative window
pointer register maintaining a speculative value of the window
pointer register. The processor also has a controller and
a mapper. The controller identifies an instruction expected to
modify the value in the window pointer register, and in
response to identifying the instruction, the controller modifies
the speculative value. The mapper is coupled to the
speculative window pointer register and converts the
instruction-specified logical addresses to physical addresses
based on the speculative value contained in the speculative
window pointer register.

The foregoing and other features, utilities and advantages
of the invention will be apparent from the following more
particular description of a preferred embodiment of the
invention as illustrated in the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows in block diagram form a computer in
accordance with the present invention.

FIG. 2 shows a processor in block diagram in accordance
with the present invention.

FIG. 3 illustrates a register file of 128 registers accessible
through a movable 32-register window with a current window
pointer (CWP).

FIG. 4 shows a block diagram of the instruction renaming
unit 204 having instruction flattening logic and a dependency
checking module in accordance with the present
invention.

FIG. 5 illustrates the speculative window management
controller and the speculative current window pointer for
each instruction in a bundle in accordance with the present
invention.

FIG. 6 illustrates the speculative window logic 600 for
restoring the speculative copies of the window management
control registers with a window repair table 602 upon a
branch misprediction, in accordance with the present invention.

FIG. 7 illustrates a window repair table 602 in accordance
with the present invention.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

Instructions operating within a processor generally
specify a first source register, a second source register, and
a destination register. These registers are encoded within the
instruction using logical addresses (e.g., r0, r3, r12). Because
it is necessary in an out-of-order processor to check for
dependencies between instructions prior to issuing the
instructions for execution, the present invention speculatively
converts the logical addresses of registers specified by
an instruction to the physical addresses of the registers in
order to determine dependencies between instructions. The
apparatus and method of the present invention will be
described herein, particularly with reference to FIGS. 4-7.

Processor architectures can be represented as a collection
of interacting functional units as shown in FIG. 1. These
functional units, discussed in greater detail below, perform
the functions of fetching instructions and data from memory,
preprocessing fetched instructions, scheduling instructions
to be executed, executing the instructions, managing
memory transactions, and interfacing with external circuitry
and devices.

The present invention is described in terms of apparatus
and methods particularly useful in a superpipelined and
superscalar processor 102 shown in block diagram form in
FIG. 1 and FIG. 2. The particular examples represent
implementations useful in high clock frequency operation
and processors that issue and execute multiple instructions
per cycle (IPC). However, it is expressly understood that the
inventive features of the present invention may be usefully
embodied in a number of alternative processor architectures
that will benefit from the performance features of the present
invention. Accordingly, these alternative embodiments are
equivalent to the particular embodiments shown and
described herein.

FIG. 1 shows a typical general purpose computer system
100 incorporating a processor 102 in accordance with the
present invention. Computer system 100 in accordance with
the present invention comprises an address/data bus 101 for
communicating information, processor 102 coupled with
bus 101 through input/output (I/O) device 103 for processing
data and executing instructions, and memory system 104
coupled with bus 101 for storing information and instructions
for processor 102. Memory system 104 comprises, for
example, cache memory 105 and main memory 107. Cache
memory 105 includes one or more levels of cache memory.
In a typical embodiment, processor 102, I/O device 103, and
some or all of cache memory 105 may be integrated in a
single integrated circuit, although the specific components
and integration density are a matter of design choice selected
to meet the needs of a particular application.

User I/O devices 106 are coupled to bus 101 and are
operative to communicate information in appropriately
structured form to and from the other parts of computer 100.
User I/O devices may include a keyboard, mouse, card
reader, magnetic or paper tape, magnetic disk, optical disk,
or other available input devices, including another computer.
Mass storage device 117 is coupled to bus 101 and may be
implemented using one or more magnetic hard disks, magnetic
tapes, CD ROMs, large banks of random access
memory, or the like. A wide variety of random access and
read only memory technologies are available and are equivalent
for purposes of the present invention. Mass storage 117
may include computer programs and data stored therein.
Some or all of mass storage 117 may be configured to be
incorporated as a part of memory system 104.

In a typical computer system 100, processor 102, I/O
device 103, memory system 104, and mass storage device
117 are coupled to bus 101 formed on a printed circuit board
and integrated into a single housing as suggested by the
dashed-line box 108. However, the particular components
chosen to be integrated into a single housing are based upon
market and design choices. Accordingly, it is expressly
understood that fewer or more devices may be incorporated
within the housing suggested by dashed line 108.

Display device 109 is used to display messages, data, a
graphical or command line user interface, or other communications
with the user. Display device 109 may be
implemented, for example, by a cathode ray tube (CRT)
monitor, liquid crystal display (LCD) or any available
equivalent.

FIG. 2 illustrates principal components of processor 102
in greater detail in block diagram form. It is contemplated
that processor 102 may be implemented with more or fewer
functional components and still benefit from the apparatus
and methods of the present invention unless expressly
specified herein. Also, functional units are identified using a
precise nomenclature for ease of description and
understanding, but other nomenclature is often used to
identify equivalent functional units.

Instruction fetch unit (IFU) 202 comprises instruction
fetch mechanisms and includes, among other things, an
instruction cache for storing instructions, branch prediction
logic, and address logic for addressing selected instructions
in the instruction cache. The instruction cache is commonly
referred to as a portion (I$) of the level one (L1) cache with
another portion (D$) of the L1 cache dedicated to data
storage. IFU 202 fetches one or more instructions at a time
by appropriately addressing the instruction cache. The
instruction cache feeds addressed instructions to instruction
rename unit (IRU) 204. Preferably, IFU 202 fetches multiple
instructions each cycle and in a specific example fetches
eight instructions each cycle, known as an instruction
bundle. Any number of instructions may be included in a
bundle to meet the needs of a particular application.

In the absence of a conditional branch instruction, IFU 202
addresses the instruction cache sequentially. The branch
prediction logic in IFU 202 handles branch instructions,
including unconditional branches. An outcome tree of each
branch instruction is formed using any of a variety of
available branch prediction algorithms and mechanisms.
More than one branch can be predicted simultaneously by
supplying sufficient branch prediction resources. After the
branches are predicted, the address of the predicted branch
is applied to the instruction cache rather than the next
sequential address. If a branch is mispredicted, the instructions
processed from the mispredicted branch are flushed
from the processor, and the processor state is restored to the
state prior to the mispredicted branch. For instructions
which affect the speculative calculation of the physical
address of a register, restoration of the processor's window
management registers is discussed below with reference to
FIGS. 6-7.

IRU 204 comprises one or more pipeline stages that
include instruction renaming and dependency checking
mechanisms. In accordance with the present invention, the
instruction renaming mechanism is operative to map register
specifiers in the instructions to physical register locations
and to perform register renaming to prevent dependencies.
IRU 204 further comprises dependency checking mechanisms
that analyze the instructions to determine if the
operands (identified by the instructions' register specifiers)
cannot be determined until another "live instruction" has
completed. The term "live instruction" as used herein refers
to any instruction that has been fetched from the instruction
cache but has not yet completed or been retired.

IRU 204 outputs renamed instructions to instruction
scheduling unit (ISU) 206 and indicates any dependency
which the instruction may have on other prior live instructions.
As will be described with reference to FIGS. 3-7, IRU
204 includes mechanisms to speculatively calculate the
address of registers specified by an instruction so
that instruction dependencies can be properly detected.

ISU 206 receives renamed instructions from IRU 204 and
registers them for execution. ISU 206 is operative to schedule
and dispatch instructions as soon as their dependencies
have been satisfied into an appropriate execution unit (e.g.,
integer execution unit (IEU) 208 or floating-point and
graphics unit (FGU) 210). ISU 206 also maintains trap status
of live instructions. ISU 206 may perform other functions
such as maintaining the correct architectural state of processor
102, including state maintenance when out-of-order
instruction processing is used. ISU 206 may include mechanisms
to redirect execution appropriately when traps or
interrupts occur and to ensure efficient execution of multiple
threads where multiple threaded operation is used. Multiple
thread operation means that processor 102 is running multiple
substantially independent processes simultaneously.
Multiple thread operation is consistent with but not required
by the present invention.

ISU 206 also operates to retire executed instructions when
completed by IEU 208 and FGU 210. ISU 206 performs the
appropriate updates to architectural register files and condition
code registers upon complete execution of an instruction.
ISU 206 is responsive to exception conditions and
discards or flushes operations being performed on instructions
subsequent to an instruction generating an exception in
the program order. ISU 206 quickly removes instructions
from a mispredicted branch and initiates IFU 202 to fetch
from the correct branch. An instruction is retired when it has
finished execution and all instructions on which it depends
have completed. Upon retirement the instruction's result is
written into the appropriate register file and is no longer
deemed a "live instruction".

IEU 208 includes one or more pipelines, each pipeline
comprising one or more stages that implement integer
instructions. IEU 208 also includes mechanisms for holding
the results and state of speculatively executed integer
instructions. IEU 208 functions to perform final decoding of
integer instructions before they are executed on the execution
units and to determine operand bypassing amongst
instructions in an out-of-order processor. IEU 208 executes
all integer instructions including determining correct virtual
addresses for load/store instructions. IEU 208 also maintains
correct architectural register state for a plurality of integer
registers in processor 102. IEU 208 preferably includes
mechanisms to access single and/or double-precision architectural
registers as well as single and/or double-precision
rename registers.

The floating point graphics and execution unit FGU 210
includes one or more pipelines, each comprising one or more
stages that implement floating-point instructions. FGU 210
also includes mechanisms for holding the results and state of
speculatively executed floating-point and graphic instructions.
FGU 210 functions to perform final decoding of
floating-point instructions before they are executed on the
execution units and to determine operand bypassing
amongst instructions in an out-of-order processor. In the
specific example, FGU 210 includes one or more pipelines
dedicated to implement special purpose multimedia and
graphic instructions that are extensions to standard architectural
instructions for a processor. FGU 210 may be
equivalently substituted with a floating-point unit (FPU) in
designs in which special purpose graphic and multimedia
instructions are not used. FGU 210 preferably includes
mechanisms to access single and/or double-precision architectural
registers as well as single and/or double-precision
rename registers.

A data cache memory unit (DCU) 212, including cache
memory 105 shown in FIG. 1, functions to cache memory
reads from off-chip memory through external interface unit
(EIU) 214. Optionally, DCU 212 also caches memory write
transactions. DCU 212 comprises one or more hierarchical
levels of cache memory and the associated logic to control
the cache memory. One or more of the cache levels within
DCU 212 may be read only memory to eliminate the logic
associated with cache writes.

The apparatus and method for speculatively translating
logical register addresses to physical addresses in accordance
with the present invention is implemented primarily in
the instruction renaming unit IRU 204.

Referring to FIG. 3, a register file 300 having 128
registers is shown with a window 302 having 32 registers.
Window 302 is movable within the register file 300 by a
program or process executing on processor 102. For
example, different processes running within processor 102
could allocate their own register window 302 to access 32
registers independent of the other processes executing
within the processor.

An individual register within window 302 is physically
accessible through register address 304 and current window
pointer (CWP) 306. Because the window 302 has 32
registers, the register address 304 will be a 5-bit address. A
program, however, would access the registers through a
typical naming convention such as r0, r1, r2 . . . r29, r30, and
r31. In this sense, the current window pointer 306 acts as an
offset to address the registers contained in the current
window 302. While register file 300 has been shown having
128 registers, and window 302 has been shown as having 32
registers, it will be understood that the size of the register file
and register windows is a matter of choice depending upon
the needs of a particular application, and as such does not limit
the present invention.

In SPARC, certain instructions and architectural status
registers relate to management of the register windows. As
discussed above, a current window pointer (CWP) is maintained
in a CWP register to track the current location of the
window within the register file. A "SAVE" instruction allocates
a new register window to the routine executing it, and
saves the prior register window by incrementing the CWP
register. A "RESTORE" instruction restores the previous
register window (i.e., the register window saved by the last
SAVE instruction executed by the current process) by
decrementing the CWP register.

A window overflow occurs when a SAVE instruction is
executed and the next register window is unavailable or
occupied. An overflow causes a spill trap or exception that
allows privileged software to save the occupied register
window in memory, thereby making the window available
for use. A window underflow occurs when a RESTORE
instruction is executed and the previous register window
contains no validly saved register data. An underflow causes
a fill trap or exception that allows privileged software to load
the window registers from memory.

A savable windows register (CANSAVE) contains the
number of register windows following the CWP that are not
in use and are available for allocation by a SAVE instruction
without generating a window spill exception. A restorable
windows register (CANRESTORE) contains the number of
register windows preceding the CWP that are in use by the
current program and can be restored, via a RESTORE
instruction, without generating a window fill exception.

If the CANSAVE register equals 0, execution of a SAVE
instruction causes a window spill (overflow) exception. If
the SAVE instruction does not cause an exception, and a new
register window is allocated, the CWP register is
incremented, the CANSAVE register is decremented, and
the CANRESTORE register is incremented.

If the CANRESTORE register equals 0, execution of a
RESTORE instruction causes a window fill (underflow)
exception. If the RESTORE instruction does not cause an
exception, the previous register window is restored by
decrementing the CWP register, the CANRESTORE value is
decremented, and the CANSAVE value is incremented.

The state of the register windows is determined by a set
of privileged window management registers comprising the
CWP register, the CANSAVE register, and the CANRESTORE
register. A write privileged register "WRPR"
instruction permits writing data to these privileged registers.

In accordance with the present invention, speculative
copies of these window management registers are maintained
so that when a SAVE or RESTORE instruction is
detected in an instruction bundle from the IFU 202, the
proper physical address of a register specified by a subsequent
instruction in the bundle can be speculatively calculated.

FIG. 4 illustrates a block diagram showing instruction
flattening logic 400 for processing the logical addresses
specified by instructions in an instruction bundle 402. Mapping
logical addresses to physical addresses is referred to
herein as "flattening" the windowed registers. When the
windowed registers are flattened, each register is uniquely
identified by physical address.

As mentioned above, the instruction bundle 402 contains
up to eight instructions, each instruction specifying source
registers and destination registers using logical addresses.
The instruction flattener logic 400 speculatively converts the
incoming logical register addresses 404 of an instruction
into their expected actual physical addresses 406.

The physical register addresses 406 are used by dependency
checker 408 to determine any true register/data
dependencies between instructions, so that the instructions
can be properly scheduled by the instruction scheduling unit
206 for execution within the processor. The ISU 206 will
schedule instructions such that any instructions dependent
upon the completion of other instructions will be scheduled
for execution in the proper order. Instructions which have no
dependency on prior instructions can be scheduled
out-of-order to improve the performance of the processor.

FIG. 5 illustrates a block diagram of an embodiment of
the instruction flattener logic 400 in accordance with the
present invention. A logical to physical (L2P) mapper 500
maps a logical address 404 of a register in a register window
to a physical address 406. Each instruction (shown as I0,
I1, . . . I7) in the bundle 402 has an L2P mapper 500 for each
register specifier, and a speculative window management
controller (SWMC) 502 for the op-code 504 of the instruction.
Each instruction in the bundle 402 also has a speculative
copy 506 of the current window pointer 306.

As previously mentioned, each instruction within the
instruction bundle 402 can have a first source register (rs1),
a second source register (rs2), and a destination register (rd).
Because these registers are specified within the instruction
using logical addresses, the L2P mapper 500 converts the
logical register address 404 into a physical register address
406 of a specified register. The L2P mapper 500 uses a
speculative copy 506 of the current window pointer 306 to
perform the conversion.

Since there are eight instructions in an incoming bundle of
instructions, and each instruction in the bundle can specify
up to three registers in the instruction, each instruction in an
instruction bundle has three L2P mappers 500. A first
mapper translates the logical address of the first source
register rs1 to a physical address; a second mapper translates
the logical address of a second source register rs2 to a
physical address; and a third mapper translates the logical
address of the destination register rd specified in the instruction
to a physical address.

Because processor 102 utilizes a register file supporting
multiple register windows movable within the register file,
and because any instruction within the bundle can contain an
instruction which is expected to shift the register window
(i.e., a SAVE or RESTORE instruction), the present invention
detects if an instruction within the bundle will shift or
otherwise affect the location or status of the current register
window. If so, the speculative copy 506 of the window

indicates that the instruction is a WRPR privileged instruction
to a window management register.

The speculative copies of the window management registers
are generally out of synchronization with the values in
the CWP, CANSAVE, and CANRESTORE architectural
registers. Synchronizing the speculative and architectural
copies of the window management registers is needed when
the architectural copy is modified by a privileged WRPR
instruction; after a window trap; or when the speculative
copy needs to be reset back to a previous state after a branch
misprediction.

When the controller 502 detects that the present instruction
will affect the window management registers (but is not
a privileged instruction) and will not cause a window trap,
the controller asserts the stall output signal, and either the
shift left signal or the shift right signal to alter the speculative
value of the CWP for subsequent instructions. When
the stall signal is asserted, the instructions in the bundle that
follow the present instruction are stalled.

When an instruction in the bundle is a WRPR privileged
instruction which affects a window management register, the
controller 502 asserts the WRPR signal if none of the
preceding controllers have an asserted stall signal. In this
case, the instructions in the same bundle are stalled. The stall
will stay in effect until the WRPR instruction is executed and
the target window management register is updated. The stall
also stays in effect until the speculative copies of the window
management registers are synchronized with the values in
the CWP, CANSAVE, and CANRESTORE architectural
registers. Stalling the pipeline in this instance is acceptable
because the WRPR privileged write instruction to window
management registers is not expected to occur very often.

Windowing traps such as a window spill or fill exception

pointer is altered to permit speculative calculation of the ^ can b e speculatively determined by controller 502. The 

physical address of the register. controller maintains a speculative copy of the CWP register, 

If, for example, the first instruction in the instruction along with speculative copies of the CANSAVE and CAN- 

bundle would shift the window pointer (i.e., a SAVE RESTORE registers (shown in FIG. 6). For example, if the 

instruction), then in order to properly calculate speculatively CANSAVE register equals zero and the controller detects a 

the physical address of the registers specified by subsequent 4Q SAVE instruction, a window spill exception will be gener- 

instructions in the bundle, the speculative copy of the current ated when the SAVE instruction is executed, 

window pointer is altered for each subsequent instruction in When an instruction expected to cause a window trap is 

the bundle. This speculative copy of the current window detected by controller 502, the trap signal is asserted if none 

pointer is then used to calculate the physical register 0 f tne proceeding controllers have the stall signal asserted, 

addresses of each register specified in the instructions. 45 j n one embodiment of the invention, detection of a window 

The controller 502 identifies an instruction expected to trap causes cancellation of the instruction that caused the 

modify the state of the register window. Controller 502 has trap, and instructions following are canceled as well, until a 

as an input the speculative copy 506 of the current window bundle is received that contains the appropriate trap handler, 

pointer 306. Controller 502 also utilizes the op-code 504 of A message can be sent to the instruction scheduling unit ISU 

the particular instruction in the bundle to determine if the 50 notifying it of the trap. This message could contain all of the 

instruction is expected to affect the position or status of the information needed to calculate the trap handler base 

current register window. Controller 502 can also anticipate address which is passed back to the instruction fetch unit 

window traps which are expected to occur in overflow or IFU. The IFU then starts fetching the trap handler. When the 

underflow conditions. As will be explained below with trap is completely serviced by the trap handler routine, the 

reference to FIG. 6 and 7, controller 502 can also restore the 55 speculative copies and the architectural copies of the CWP, 

speculative copies of the CWP, CANSAVE, and CANRE- CANSAVE, and CANRESTORE registers are synchro- 

STORE registers if the controller modified the values of the nized. 

speculative copies in response to instructions in a branch Referring to FIG. 6, in order to handle the possibility of 

later determined mispredicted. a mispredicted instruction in the instruction bundle affecting 

The outputs of controller 502 comprise a shift left signal 60 the current window, IRU 204 has speculative window logic 

510A, a shift right signal 510B, a stall signal 5 10C, a trap 600 which maintains a local "speculative" copy of the 

signal 510D, and a privileged WRPR signal 510E. architectural window management registers. These architec- 

The shift left signal increments the speculative copy of the tural registers comprise the CWP register 610 (shown as 306 

CWP for all subsequent instructions, and the shift right in FIGS. 3 and 5), the CANSAVE register 612, and the 

signal decrements the speculative copy of the CWP. The trap 65 CANRESTORE register 614. The speculative copies 

signal indicates the present instruction is expected to cause include a speculative current window pointer (S_CWP) 

a window trap (i.e., a spill or fill trap). The WRPR signal register 620, a speculative CANSAVE (S_CANSAVE) reg- 
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ister 622, and a speculative CANRESTORE 
(S_CANRESTORE) register 624. 
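
By way of illustration only, the logical to physical conversion performed by the L2P mappers 500 for a bundle might be sketched as follows. The physical layout is an assumption borrowed from the SPARC V9 register window convention (eight globals, and sixteen unique registers per window, with the outs of one window overlapping the ins of the next); the text above does not specify the layout. The names `l2p`, `map_bundle`, `NWINDOWS`, and `NGLOBALS` are hypothetical, and the sketch does not model how a SAVE or RESTORE treats its own destination register.

```python
# Illustrative sketch only; the register layout is an assumption, not the
# layout of the described embodiment.
NWINDOWS = 8   # assumed number of register windows
NGLOBALS = 8   # r0-r7 are assumed global and are not windowed


def l2p(logical, s_cwp, nwindows=NWINDOWS):
    """Map a logical register number (0..31) to a physical index."""
    if not 0 <= logical < 32:
        raise ValueError("logical register must be in 0..31")
    if logical < NGLOBALS:
        return logical                      # globals map to themselves
    ring = 16 * nwindows                    # 16 unique registers per window
    if logical < 16:                        # outs r8-r15 (ins of next window)
        offset = 16 * (s_cwp + 1) + (logical - 8)
    elif logical < 24:                      # locals r16-r23
        offset = 16 * s_cwp + 8 + (logical - 16)
    else:                                   # ins r24-r31
        offset = 16 * s_cwp + (logical - 24)
    return NGLOBALS + offset % ring


def map_bundle(bundle, s_cwp, nwindows=NWINDOWS):
    """Translate (opcode, rs1, rs2, rd) tuples using a speculative CWP."""
    mapped = []
    for opcode, rs1, rs2, rd in bundle:
        # Three mappers per instruction, all using this slot's CWP copy.
        mapped.append(tuple(l2p(r, s_cwp, nwindows) for r in (rs1, rs2, rd)))
        # A window-shifting instruction alters the copy seen by later slots.
        if opcode == "SAVE":
            s_cwp = (s_cwp + 1) % nwindows
        elif opcode == "RESTORE":
            s_cwp = (s_cwp - 1) % nwindows
    return mapped, s_cwp
```

In this sketch a SAVE in the first slot causes every later slot in the bundle to be translated with the incremented speculative pointer, which is the behavior described above for subsequent instructions in the bundle.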

The speculative copies of the architectural window man- 
agement registers are generally ahead of the processor (i.e., 
ahead of the values stored in the architectural window
management registers), since the speculative copies are used 
to speculatively calculate the physical addresses of registers 
specified by an instruction in a bundle. 
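
As a hedged illustration of how the speculative copies run ahead of the architectural registers, the decision made by controller 502 for a single instruction might be modeled as below. Several points are assumptions rather than statements of the described embodiment: SAVE is paired with the shift left (increment) signal and RESTORE with the shift right signal, following the SPARC V9 convention; the updates to S_CANSAVE and S_CANRESTORE likewise follow SPARC V9; the opcode string "WRPR" stands in for a privileged write to a window management register; and the condition on the stall signals of preceding controllers is not modeled. The names `SpecRegs` and `controller` are hypothetical.

```python
from dataclasses import dataclass

NWINDOWS = 8  # assumed number of register windows


@dataclass
class SpecRegs:
    """Speculative copies of the window management registers."""
    s_cwp: int
    s_cansave: int
    s_canrestore: int


def controller(opcode, regs, nwindows=NWINDOWS):
    """Return the set of asserted outputs for one instruction.

    The speculative registers are updated only when no trap is expected.
    """
    if opcode == "WRPR":          # stand-in for WRPR to a window register
        return {"wrpr"}
    if opcode == "SAVE":
        if regs.s_cansave == 0:
            return {"trap"}       # window spill expected
        regs.s_cwp = (regs.s_cwp + 1) % nwindows
        regs.s_cansave -= 1       # assumed SPARC V9 bookkeeping
        regs.s_canrestore += 1
        return {"stall", "shift_left"}
    if opcode == "RESTORE":
        if regs.s_canrestore == 0:
            return {"trap"}       # window fill expected
        regs.s_cwp = (regs.s_cwp - 1) % nwindows
        regs.s_cansave += 1
        regs.s_canrestore -= 1
        return {"stall", "shift_right"}
    return set()                  # instruction does not affect the window
```

Because the speculative registers are updated as soon as the instruction is identified, they reflect SAVE and RESTORE instructions that the processor has not yet executed.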

As previously mentioned, instructions following conditional
branch instructions are fetched and speculatively
processed within the processor. However, while these
instructions following a predicted branch are being
processed, it is not certain that these instructions
will be executed (i.e., the branches are unconfirmed). The
controller may therefore have received instructions from a
mispredicted branch and, in response, modified the
speculative window registers, such as the speculative copy 620
of the CWP.

In accordance with the present invention, a window repair 
table 602 is utilized for ensuring the proper restoration of the 
speculative copies of the architectural window management 
registers upon a branch misprediction. FIG. 7 illustrates an 
embodiment of the window repair table 602. The table 
contains an entry for each branch instruction received by the 
IRU. A branch identification (BID) field 700 identifies the 
particular branch instruction. Along with the BID, the table 
contains the values of the speculative registers at the time the 
branch instruction was received by the IRU. Fields 720, 722, 
and 724 are shown containing the values of the S_CWP, 
S_CANSAVE, and S_CANRESTORE. These values are 
essentially backup copies of the values in the speculative 
registers at the time the branch instruction was detected. 

Upon a branch mispredict, the state of the speculative 
registers is restored to their respective states prior to pro- 
cessing the instructions of the mispredicted branch. The 
speculative window logic 600 (FIG. 6) copies the values 
720, 722, and 724 from the window repair table 602 
(corresponding to the mispredicted branch) to the specula- 
tive registers 620, 622, and 624. In this way, the speculative 
registers are restored to their values prior to possible cor-
ruption due to processing of mispredicted instructions.
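
The checkpoint and restore behavior of the window repair table 602 might be sketched, purely as an illustration, as follows. The class and method names are hypothetical, a Python dictionary keyed by BID stands in for the table, and the sketch does not model how entries are retired or how entries for younger branches are handled after a misprediction.

```python
from dataclasses import dataclass, astuple


@dataclass
class SpecRegs:
    """Speculative window management registers 620, 622, and 624."""
    s_cwp: int
    s_cansave: int
    s_canrestore: int


class WindowRepairTable:
    """Backup copies of the speculative registers, one entry per branch."""

    def __init__(self):
        self._entries = {}  # BID -> (S_CWP, S_CANSAVE, S_CANRESTORE)

    def record(self, bid, regs):
        # Called when a branch is received: snapshot the current values.
        self._entries[bid] = astuple(regs)

    def restore(self, bid, regs):
        # Called on a mispredict: copy the saved values back.
        regs.s_cwp, regs.s_cansave, regs.s_canrestore = self._entries[bid]
```

For example, recording an entry for a branch, letting later instructions alter the speculative registers, and then calling `restore` with the same BID returns the registers to the values they held when the branch was received.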

While the invention has been particularly shown and 
described with reference to a preferred embodiment thereof, 
it will be understood by those skilled in the art that various
other changes in the form and details may be made without
departing from the spirit and scope of the invention. For
instance, while the present invention has been described 
with reference to a processor architecture shown in FIG. 2, 
it will be understood that the present invention could be used 
in other equivalent processor designs. 

What is claimed is: 

1. A processor executing instructions specifying logical 
addresses of a first source register, a second source register, 
and a destination register, the processor comprising: 

a register file comprising a plurality of registers, a portion 
of said registers accessible through a window movable
within said register file, each register uniquely identi- 
fied within said window by a logical address; 

a window pointer register maintaining a value corre- 
sponding to the location of said window in said register 
file;

a speculative window pointer register maintaining a 
speculative value of said window pointer register; 

a controller identifying an instruction expected to modify 
the value in the window pointer register, and in 
response to identifying said instruction, the controller
modifies the speculative value of said window pointer 
register; and 
a mapper coupled to said speculative window pointer
register, the mapper converting said logical addresses 
to physical addresses based on the speculative value 
contained in the speculative window pointer register. 

2. The processor of claim 1, wherein said controller 
detects a privileged instruction modifying said window 
pointer register, and in response to detecting said privileged 
instruction the controller updates the speculative value with 
the value stored in the window pointer register. 

3. The processor of claim 1, wherein said controller 
detects that said instruction resulted from a mispredicted 
branch instruction. 

4. The processor of claim 1, wherein said controller 
detects that said instruction is expected to cause a window 
overflow exception. 

5. The processor of claim 1, wherein said controller 
detects that said instruction is expected to cause a window 
underflow exception. 

6. A processor executing instructions specifying logical 
addresses of a first source register, a second source register, 
and a destination register, the processor comprising: 

a register file comprising a plurality of registers, a portion 
of said registers accessible through a window movable 
within said register file, each register uniquely identi- 
fied within said window by a logical address; 

a window pointer register maintaining a value corre- 
sponding to the location of said window in said register 
file; 

a speculative window pointer register maintaining a 
speculative value of said window pointer register; 

a controller identifying an instruction expected to modify 
the value in the window pointer register, and in 
response to identifying said instruction the controller 
modifies the speculative value; 

a mapper coupled to said speculative window pointer 
register, the mapper converting said instruction speci- 
fied logical addresses to physical addresses based on 
the speculative value contained in the speculative win- 
dow pointer register; and 

a savable window register maintaining a value corre- 
sponding to a number of register windows available for 
use in said register file; and a speculative window 
register maintaining a speculative value of said savable 
window register.

7. The processor of claim 6, wherein said controller 
speculatively detects a window overflow exception upon 
detecting said speculative window register indicates there 
are no register windows available and said controller iden- 
tifies a SAVE instruction. 

8. A processor executing instructions specifying logical 
addresses of a first source register, a second source register, 
and a destination register, the processor comprising: 

a register file comprising a plurality of registers, a portion 
of said registers accessible through a window movable 
within said register file, each register uniquely identi- 
fied within said window by a logical address; 

a window pointer register maintaining a value corre- 
sponding to the location of said window in said register 
file; 

a speculative window pointer register maintaining a 
speculative value of said window pointer register; 

a controller identifying an instruction expected to modify 
the value in the window pointer register, and in 
response to identifying said instruction the controller 
modifies the speculative value; 
a mapper coupled to said speculative window pointer
register, the mapper converting said instruction speci- 
fied logical addresses to physical addresses based on 
the speculative value contained in the speculative win-
dow pointer register;

a restorable window register maintaining a value corre- 
sponding to a number of stored register windows in 
said register file available for restoration; and 

a speculative restorable window register maintaining a 
speculative value of said restorable window register.

9. The processor of claim 8, wherein said controller 
speculatively detects a window underflow exception upon 
detecting said speculative restorable window register indi- 
cates there are no register windows for restoration and said 
controller identifies a RESTORE instruction.

10. The processor of claim 1, further comprising: 

a table storing a plurality of entries, each entry specifying 
a branch instruction and a backup speculative copy of 
the speculative value of said window pointer register.

11. The processor of claim 10, wherein said controller 
updates the speculative window pointer register with said 
backup speculative copy upon detecting a branch mispre- 
diction corresponding to said branch instruction. 

12. A computer system comprising:
a memory system; 

a processor coupled to said memory system, the processor 
executing instructions specifying logical addresses of a 
first source register, a second source register, and a 
destination register, the processor comprising:

a register file comprising a plurality of registers, a portion 
of said registers accessible through a window movable 
within said register file, each register uniquely identi-
fied within said window by a logical address;



a window pointer register maintaining a value corre- 
sponding to the location of said window in said register 
file; 

a speculative window pointer register maintaining a 
speculative value of said window pointer register; 

a controller identifying an instruction expected to modify 
the value in the window pointer register, and in 
response to identifying said instruction, the controller 
modifies the speculative value of said window pointer 
register; and 

a mapper coupled to said speculative window pointer 
register, the mapper converting said instruction speci- 
fied logical addresses to physical addresses based on 
the speculative value contained in the speculative win- 
dow pointer register. 

13. The computer system of claim 12, wherein said 
controller detects a privileged instruction modifying said 
window pointer register, and in response to detecting said 
privileged instruction the controller updates the speculative 
value with the value stored in the window pointer register. 

14. The computer system of claim 13, wherein said 
controller detects that said instruction resulted from a 
mispredicted branch instruction. 

15. The computer system of claim 14, wherein said 
controller detects that said instruction is expected to cause a 
window overflow exception. 

16. The computer system of claim 15, wherein said 
controller detects that said instruction is expected to cause a 
window underflow exception. 